ECB - Open Data¶

This tutorial was originally published on DataCareer.

In this Jupyter Notebook we will retrieve data from the European Central Bank (ECB). Before diving into the code, please take a quick look at the following websites, to get a feel for the data we will be dealing with:

EU portal: https://data.europa.eu/euodp/en/data/publisher/ecb

ECB SDMX 2.1 RESTful web service: https://sdw-wsrest.ecb.europa.eu/help/

SDMX documentation: https://sdmx.org

ECB statistics: https://www.ecb.europa.eu/stats/ecb_statistics/html/index.en.html

Statistical Data Warehouse: https://sdw.ecb.europa.eu

In this tutorial we will specifically take a look at foreign exchange (FX) rates, using Python 3. As always, let's start with importing some packages we will use for this exercise.

import requests     # 2.18.4
import pandas as pd # 0.23.0
import io

In this notebook, we will retrieve the Euro / Swiss Francs (EURCHF) exchange rate time series from the year 2000 until today. To retrieve the data, we need to construct an URL which we can use in a HTTP request. Fortunately, this is pretty simple.

The query string is basically:
protocol://wsEntryPoint/resource/flowRef/key?parameters
And the parameters are defined as follows: startPeriod=value&endPeriod=value&updatedAfter=value&firstNObservations=value&lastNObservations=value&detail=value&includeHistory=value

Let's break this down:

# Building blocks for the URL
entrypoint = 'https://sdw-wsrest.ecb.europa.eu/service/' # Using protocol 'https'
resource = 'data'           # The resource for data queries is always'data'
flowRef ='EXR'              # Dataflow describing the data that needs to be returned, exchange rates in this case
key = 'D.CHF.EUR.SP00.A'    # Defining the dimension values, explained below

# Define the parameters
parameters = {
    'startPeriod': '2000-01-01',  # Start date of the time series
    'endPeriod': '2018-10-01'     # End of the time series
}

Key (dimensions) explained:

the frequency of the measurement: D for daily
the currency being measured: CHF for Swiss Francs
the currency against which a currency is being measured: EUR for Euros
the type of exchange rates: foreign exchange reference rates have code SP00
the series variation (such as average or standardised measure for given frequency): code A

For this example we only use two parameters, startPerdiod and endPeriod, but you can add more if you like.

Now we have to put all this together to construct the URL:

# Construct the URL: https://sdw-wsrest.ecb.europa.eu/service/data/EXR/D.CHF.EUR.SP00.A
request_url = entrypoint + resource + '/'+ flowRef + '/' + key

# Make the HTTP request
response = requests.get(request_url, params=parameters)

# Check if the response returns succesfully with response code 200
print(response)

# Print the full URL
print(response.url)

<Response [200]>
https://sdw-wsrest.ecb.europa.eu/service/data/EXR/D.CHF.EUR.SP00.A?startPeriod=2000-01-01&endPeriod=2018-10-01

The request/response has been succesful. If you click on this link, it will download the data as a file. It doesn't have a useful extension, but you can open it as a text file if you would like to inspect it's contents. But of course we dan do this in Python too without leaving this notebook. Let's take a sneak peek at the data we received.

# Print the first 1000 characters of the response
print(response.text[0:1000])

<?xml version="1.0" encoding="UTF-8"?><message:GenericData xmlns:message="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/message" xmlns:common="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/common" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:generic="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/data/generic" xsi:schemaLocation="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/message https://sdw-wsrest.ecb.europa.eu:443/vocabulary/sdmx/2_1/SDMXMessage.xsd http://www.sdmx.org/resources/sdmxml/schemas/v2_1/common https://sdw-wsrest.ecb.europa.eu:443/vocabulary/sdmx/2_1/SDMXCommon.xsd http://www.sdmx.org/resources/sdmxml/schemas/v2_1/data/generic https://sdw-wsrest.ecb.europa.eu:443/vocabulary/sdmx/2_1/SDMXDataGeneric.xsd">
<message:Header>
<message:ID>11e396e5-4d14-49ae-bdf4-fad98a32e113</message:ID>
<message:Test>false</message:Test>
<message:Prepared>2019-05-26T13:03:53.512+02:00</message:Prepared>
<message:Sender id="ECB"/>
<message:Structure structureID="

Hmmm, the response is in XML. Not impossible, but also not the easiest format to work with in Pandas. Fortunately the ECB's API let's us get the data in CSV format by specifying it in the header of the request.

# Make the HTTP request again, now requesting for CSV format
response = requests.get(request_url, params=parameters, headers={'Accept': 'text/csv'})
# Response succesful? (Response code 200)
print(response)

<Response [200]>

# Print the first 1000 characters to inspect the response
response.text[0:1000]

'KEY,FREQ,CURRENCY,CURRENCY_DENOM,EXR_TYPE,EXR_SUFFIX,TIME_PERIOD,OBS_VALUE,OBS_STATUS,OBS_CONF,OBS_PRE_BREAK,OBS_COM,TIME_FORMAT,BREAKS,COLLECTION,COMPILING_ORG,DISS_ORG,DOM_SER_IDS,PUBL_ECB,PUBL_MU,PUBL_PUBLIC,UNIT_INDEX_BASE,COMPILATION,COVERAGE,DECIMALS,NAT_TITLE,SOURCE_AGENCY,SOURCE_PUB,TITLE,TITLE_COMPL,UNIT,UNIT_MULT\r\nEXR.D.CHF.EUR.SP00.A,D,CHF,EUR,SP00,A,2000-01-03,1.6043,A,,,,P1D,,A,,,,,,,,,,4,,4F0,,Swiss franc/Euro,"ECB reference exchange rate, Swiss franc/Euro, 2:15 pm (C.E.T.)",CHF,0\r\nEXR.D.CHF.EUR.SP00.A,D,CHF,EUR,SP00,A,2000-01-04,1.6053,A,,,,P1D,,A,,,,,,,,,,4,,4F0,,Swiss franc/Euro,"ECB reference exchange rate, Swiss franc/Euro, 2:15 pm (C.E.T.)",CHF,0\r\nEXR.D.CHF.EUR.SP00.A,D,CHF,EUR,SP00,A,2000-01-05,1.606,A,,,,P1D,,A,,,,,,,,,,4,,4F0,,Swiss franc/Euro,"ECB reference exchange rate, Swiss franc/Euro, 2:15 pm (C.E.T.)",CHF,0\r\nEXR.D.CHF.EUR.SP00.A,D,CHF,EUR,SP00,A,2000-01-06,1.6068,A,,,,P1D,,A,,,,,,,,,,4,,4F0,,Swiss franc/Euro,"ECB reference exchange rate, Swiss franc/Euro, '

Excellent! Now we only need to load this response in a Pandas DataFrame. We can use 'StringIO' to read the strings as a file. This way we don't need to save it first and we can use it directly.

# Read the response as a file into a Pandas DataFrame
df = pd.read_csv(io.StringIO(response.text))

# Check the DataFrame's information
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4859 entries, 0 to 4858
Data columns (total 32 columns):
KEY                4859 non-null object
FREQ               4859 non-null object
CURRENCY           4859 non-null object
CURRENCY_DENOM     4859 non-null object
EXR_TYPE           4859 non-null object
EXR_SUFFIX         4859 non-null object
TIME_PERIOD        4859 non-null object
OBS_VALUE          4798 non-null float64
OBS_STATUS         4859 non-null object
OBS_CONF           238 non-null object
OBS_PRE_BREAK      0 non-null float64
OBS_COM            0 non-null float64
TIME_FORMAT        4859 non-null object
BREAKS             0 non-null float64
COLLECTION         4859 non-null object
COMPILING_ORG      0 non-null float64
DISS_ORG           0 non-null float64
DOM_SER_IDS        0 non-null float64
PUBL_ECB           0 non-null float64
PUBL_MU            0 non-null float64
PUBL_PUBLIC        0 non-null float64
UNIT_INDEX_BASE    0 non-null float64
COMPILATION        0 non-null float64
COVERAGE           0 non-null float64
DECIMALS           4859 non-null int64
NAT_TITLE          0 non-null float64
SOURCE_AGENCY      4859 non-null object
SOURCE_PUB         0 non-null float64
TITLE              4859 non-null object
TITLE_COMPL        4859 non-null object
UNIT               4859 non-null object
UNIT_MULT          4859 non-null int64
dtypes: float64(15), int64(2), object(15)
memory usage: 1.2+ MB

You can see there are 4858 rows, which makes sense for almost 19 years of daily prices. There are 32 columns, which is probably more than we need. We can inspect the DataFrame to see which columns we need.

# Show the last 5 entries of the DataFrame
df.tail()

The columns we need are 'TIME_PERIOD' for the dates and 'OBS_VALUE' for the prices. Let's also do a sanity check on the prices in 'OBS_VALUE'.

# Inspect the prices. Do the mean, minimum and maximum make sense?
df['OBS_VALUE'].describe()

count    4798.000000
mean        1.379459
std         0.193938
min         0.981600
25%         1.205500
50%         1.467300
75%         1.547200
max         1.680300
Name: OBS_VALUE, dtype: float64

Now we will make a new DataFrame called 'ts' (for time series) with just the dates and EURCHF prices.

# Create a new DataFrame called 'ts'
ts = df.filter(['TIME_PERIOD', 'OBS_VALUE'], axis=1)
# 'TIME_PERIOD' was of type 'object' (as seen in df.info). Convert it to datetime first
ts['TIME_PERIOD'] = pd.to_datetime(ts['TIME_PERIOD'])
# Set 'TIME_PERIOD' to be the index
ts = ts.set_index('TIME_PERIOD')
# Print the last 5 rows to screen
ts.tail()

To conclude, we make a chart of the time series:

%matplotlib inline
ts.plot()

<matplotlib.axes._subplots.AxesSubplot at 0x115fc3f60>

BONUS: pandaSDMX¶

Very often, there are already convenient Python packages available on the internet you can use. For instance, if you search the internet for "Python ECB SDMX" you will undoubtly find the 'pandaSDMX' package. It is, as they call it, a "Python client to retrieve and acquire statistical data and metadata disseminated in SDMX 2.1, an ISO-standard widely used by institutions such as statistics offices, central banks, and international organisations."

Interestingly, pandaSDMX ships with built-in support for the following agencies:

Australian Bureau of Statistics (ABS)
European Central Bank (ECB)
Eurostat
French National Institute for Statistics (INSEE)
Instituto Nacional de la Estadìstica y Geografìa - INEGI (Mexico)
International Monetary Fund (IMF) - SDMX Central only
International Labour Organization (ILO)
Italian statistics Office (ISTAT)
Norges Bank (Norway)
Organisation for Economic Cooperation and Development (OECD)
United Nations Statistics Division (UNSD)
UNESCO (free registration required)
World Bank - World Integrated Trade Solution (WITS)

More information can be found at: https://pandasdmx.readthedocs.io/en/latest/index.html

In this last section, we are going to retrieve the same data (EURCHF daily exchange rates) with this package. First though, you need to install the package via 'pip' (via the command line):

pip install pandasdmx

Now let's get the data!

from pandasdmx import Request

# Define the source
ecb = Request('ECB')

# Retrieve the data (we start at 2016, because are requesting a larger dataset (including other frequencies))
data_response = ecb.data(resource_id = 'EXR', key={'CURRENCY': ['CHF', 'EUR']}, params = {'startPeriod': '2016'})
data = data_response.data

# The data will be a pandaSDMC 'DataSet'
type(data)

pandasdmx.model.DataSet

# Show which frequencies are available ('D' is 'daily', you can probably guess the other ones)
set(s.key.FREQ for s in data.series)

{'A', 'D', 'H', 'M', 'Q'}

# Filter the the daily data and 'write' it to a DataFrame
daily = (s for s in data.series if s.key.FREQ == 'D')
ts2 = data_response.write(daily)
ts2.tail()

%matplotlib inline
ts2.plot()

<matplotlib.axes._subplots.AxesSubplot at 0x11aed15f8>

As you can see, the pandaSDMX pacakage is another easy way to retrieve the same data. Hopefully these examples will help you get started. Thanks for reading!

	OBS_VALUE
TIME_PERIOD
2018-09-25	1.1376
2018-09-26	1.1369
2018-09-27	1.1371
2018-09-28	1.1316
2018-10-01	1.1414

FREQ	D
CURRENCY	CHF
CURRENCY_DENOM	EUR
EXR_TYPE	SP00
EXR_SUFFIX	A
TIME_PERIOD
2019-05-20	1.1263
2019-05-21	1.1274
2019-05-22	1.1252
2019-05-23	1.1224
2019-05-24	1.1215

	KEY	FREQ	CURRENCY	CURRENCY_DENOM	EXR_TYPE	EXR_SUFFIX	TIME_PERIOD	OBS_VALUE	OBS_STATUS	OBS_CONF	...	COMPILATION	COVERAGE	DECIMALS	NAT_TITLE	SOURCE_AGENCY	SOURCE_PUB	TITLE	TITLE_COMPL	UNIT
4854	EXR.D.CHF.EUR.SP00.A	D	CHF	EUR	SP00	A	2018-09-25	1.1376	A	F	...	NaN	NaN	4	NaN	4F0	NaN	Swiss franc/Euro	ECB reference exchange rate, Swiss franc/Euro,...	CHF
4855	EXR.D.CHF.EUR.SP00.A	D	CHF	EUR	SP00	A	2018-09-26	1.1369	A	F	...	NaN	NaN	4	NaN	4F0	NaN	Swiss franc/Euro	ECB reference exchange rate, Swiss franc/Euro,...	CHF
4856	EXR.D.CHF.EUR.SP00.A	D	CHF	EUR	SP00	A	2018-09-27	1.1371	A	F	...	NaN	NaN	4	NaN	4F0	NaN	Swiss franc/Euro	ECB reference exchange rate, Swiss franc/Euro,...	CHF
4857	EXR.D.CHF.EUR.SP00.A	D	CHF	EUR	SP00	A	2018-09-28	1.1316	A	F	...	NaN	NaN	4	NaN	4F0	NaN	Swiss franc/Euro	ECB reference exchange rate, Swiss franc/Euro,...	CHF
4858	EXR.D.CHF.EUR.SP00.A	D	CHF	EUR	SP00	A	2018-10-01	1.1414	A	F	...	NaN	NaN	4	NaN	4F0	NaN	Swiss franc/Euro	ECB reference exchange rate, Swiss franc/Euro,...	CHF